159 research outputs found

    Finding Stories in 1,784,532 Events: Scaling Up Computational Models of Narrative

    Information professionals face the challenge of making sense of an ever-increasing amount of information. Storylines can provide a useful way to present relevant information because they reveal explanatory relations between events. In this position paper, we present and discuss the four main challenges that make it difficult to get to these stories and our first ideas on how to start resolving them.

    MAG: A Multilingual, Knowledge-base Agnostic and Deterministic Entity Linking Approach

    Entity linking has recently been the subject of a significant body of research. Currently, the best performing approaches rely on trained mono-lingual models. Porting these approaches to other languages is consequently a difficult endeavor, as it requires corresponding training data and retraining of the models. We address this drawback by presenting a novel multilingual, knowledge-base agnostic and deterministic approach to entity linking, dubbed MAG. MAG is based on a combination of context-based retrieval on structured knowledge bases and graph algorithms. We evaluate MAG on 23 data sets and in 7 languages. Our results show that the best approach trained on English datasets (PBOH) achieves a micro F-measure that is up to 4 times worse on datasets in other languages. MAG, on the other hand, achieves state-of-the-art performance on English datasets and reaches a micro F-measure that is up to 0.6 higher than that of PBOH on non-English languages. Comment: Accepted in K-CAP 2017: Knowledge Capture Conference.

    ICON: an Ontology for Comprehensive Artistic Interpretations

    In this work, we introduce ICON, an ontology that models artistic interpretations of artworks’ subject matter (i.e. iconographies) and meanings (i.e. symbols, iconological aspects). Developed by conceptualizing authoritative knowledge and notions taken from Panofsky’s levels of interpretation theory, the ICON ontology focuses on the granularity of interpretations. It can be used to describe an interpretation of an artwork at the Pre-iconographical, Iconographical, and Iconological levels. Its main classes have been aligned to ontologies that come from the domains of cultural descriptions (ArCo, CIDOC-CRM, VIR), semiotics (DOLCE), bibliometrics (CITO), and symbolism (Simulation Ontology), to provide a robust schema that can be extended using additional classes and properties coming from these ontologies. The ontology was evaluated through competency questions that range from simple recognition on a specific level of interpretation to complex scenarios. Data written using this model was compared to state-of-the-art ontologies and schemas to both highlight the current lack of a domain-specific ontology on art interpretation and show how our work fills some of the current gaps. The ontology is openly available and compliant with FAIR principles. With our ontology, we hope to encourage digital art historians working for cultural institutions to make more detailed linked open data about the content of their artefacts, and to exploit the full potential of the Semantic Web in linking artworks through not only subjects and common metadata, but also specific symbolic interpretations, intrinsic meanings, and the motifs through which their subjects are represented. Additionally, by basing our work on theories developed by different art history scholars in the last century, we make sure that their knowledge and studies will not be lost in the transition to the digital, linked open data era.

    Evaluating named entity recognition tools for extracting social networks from novels

    The analysis of literary works has experienced a surge in computer-assisted processing. To obtain insights into the community structures and social interactions portrayed in novels, the creation of social networks from novels has gained popularity. Many methods rely on identifying named entities and relations for the construction of these networks, but many of these tools are not specifically created for the literary domain. Furthermore, studies on information extraction from literature typically focus on 19th and early 20th century source material. Because of this, it is unclear whether these techniques are as suitable for modern-day literature as they are for those older novels. We present a study in which we evaluate natural language processing tools for the automatic extraction of social networks from novels as well as their network structure. We find that there are no significant differences between old and modern novels but that both are subject to a large amount of variance. Furthermore, we identify several issues that complicate named entity recognition in our set of novels and we present methods to remedy these. We see this work as a step in creating more culturally-aware AI systems.

    A Proposal for a Two-Way Journey on Validating Locations in Unstructured and Structured Data

    The Web of Data has grown explosively over the past few years, and as with any dataset, there are bound to be invalid statements in the data, as well as gaps. Natural Language Processing (NLP) is gaining interest as a way to fill gaps in data by transforming (unstructured) text into structured data. However, there is currently a fundamental mismatch in approaches between Linked Data and NLP, as the latter is often based on statistical methods and the former on explicitly modelling knowledge. Nevertheless, these fields can strengthen each other by joining forces. In this position paper, we argue that using linked data to validate the output of an NLP system, and using textual data to validate Linked Open Data (LOD) cloud statements, is a promising research avenue. We illustrate our proposal with a proof of concept on a corpus of historical travel stories.

    MUSTI - Multimodal Understanding of Smells in Texts and Images at MediaEval 2022

    MUSTI aims to collect information about smell from digital text and image collections from the 17th to the 20th century in a multilingual setting. More precisely, MUSTI studies the relatedness of evocation of smells (smell sources being identified, objects being detected, gestures being mentioned or recognized) between texts and images. The main task is a binary classification task: identifying whether an image and a text snippet in a pair contain the same smell source, regardless of what that smell source is. An optional sub-task is the determination of the smell sources that make the respective pair related.